An exemplar model of classification and typicality in simple and combined categories
Abstract
This paper describes a new model of classification and typicality which explains a number of empirical results in simple and combined categories. In this exemplar-based model, classification is based on the diagnostic and plausible evidence of attributes. The model explains the family resemblance structure of simple categories, the overextension of typicality in some combined categories, and people's ability to judge typicality in "empty" combined categories (combinations of categories which have no members in common). The model also explains the influence of category overlap on the overextension of combined categories, and the observed dominance of some categories in combination. Finally, the model accounts for the interesting finding that "emergent" attributes in combination (attributes true of a combination but not of its constituents) are often accessed more rapidly than non-emergent attributes (true of the combination and its constituents).

In this article I describe a model of two fundamental aspects of human cognition: our ability to classify instances as members of simple categories like fish or cat, and our ability to manipulate and combine those simple categories, to classify instances as members of complex categories like mountain cat or pet fish. In this model, simple categorisation and category combination are both produced by a single exemplar-based mechanism.
This model accounts for three important empirical regularities in classification and typicality: that membership typicality in simple categories is graded by family resemblance (Rosch, 1978; Rosch & Mervis, 1975); that membership is also graded in combined categories, even "empty" combined categories for which no instances are known (Rips, 1995); and that membership typicality in combined categories sometimes shows overextension, with instances which are classified as bad members of both constituents of a combined category being classified as good members of the combination (Osherson & Smith, 1981, 1982; Hampton, 1987, 1988). Various models have been proposed to explain one or another of these aspects of classification and typicality; no other current model provides a unified explanation for all three.

This article has five sections. In the first, I review the three empirical regularities of family resemblance in simple categories, classification in "empty" combinations, and overextension in combined categories. I also outline the different models which have been proposed to account for these regularities. In the second section I describe the new exemplar-based model, which provides a unified explanation for the three regularities. In the third section I describe in detail how this model, called the "diagnostic and plausible evidence" model, accounts for a number of specific results in the overextension of combined categories. In the fourth section I extend the new model from classification to inference, showing how the model explains the interesting finding that people often infer "emergent" attributes for combined categories (attributes which are true of the combination but not of its constituent categories on their own) as fast or faster than typical attributes (attributes both true of the combination and true of its constituents; Springer & Murphy, 1992; Gagné & Murphy, 1996).
In the final section I discuss the implications the new model has for more general issues in language and thought.

Empirical Regularities in Simple and Combined Categories

The psychological literature contains a wealth of studies investigating various aspects of people's simple and combined categorisation. Some studies have focused on people's ability to make inferences about simple or combined categories (to infer that fish typically have silvery scales, or that pet fish are typically smaller than other fish, for example; see Medin & Shoben, 1988). Others have addressed people's ability to judge the membership typicality of presented items in simple or combined categories (to decide the typicality of salmon or goldfish as members of the simple category "fish" or the combined category "pet fish"). While there is a clear link between people's ability to make category inferences and their ability to make typicality judgements, it is important to be aware that the two processes can sometimes lead in different directions. For example, while people may infer that fish typically have silvery scales, they have no difficulty in recognising instances with brown scales (trout, for example) as typical members of the category "fish". In this section I summarise the empirical regularities observed in people's judgements of membership typicality for simple categories and for combined categories, and discuss various models that have been proposed for these regularities.

Simple categories: Typicality and family resemblance

Experiments across a number of domains have shown that people's classification of instances in simple categories is ordered by typicality, with some instances being judged better or more typical members of a simple category than others. To take one example, it seems natural to rate salmon, cod or trout as more typical members of the category "fish" than goldfish or sharks.
Importantly, the rated typicality of instances in categories has a reliable influence on a number of aspects of category use. For example, level of instance typicality predicts reaction times in sentence verification, order of item production when people are asked to name category instances, and item efficacy in priming tasks (see Komatsu, 1992, for a review).

The varying typicality of different instances for simple categories is usually taken to be correlated with the varying family resemblance of those instances for the categories in question. The family resemblance of an instance for a category increases as the similarity between the instance and all members of the category increases, and decreases as the similarity between the instance and members of all other categories increases. Support for a link between family resemblance and typicality comes from Rosch & Mervis's (1975) finding that the rated typicality of instances for simple categories was reliably higher for instances which shared many properties with other members of the category in question, and reliably lower for instances which shared properties with instances of other categories.

Perhaps the most successful current models of family resemblance and typicality in simple categories are similarity-based exemplar models. These models assume that people represent categories by storing sets of individual instances or exemplars in memory; classification in a given category is performed relative to the set of stored exemplars of that category. The typicality of an instance in a simple category is computed by relative similarity: an instance is classified as a member of a category if it is similar enough to the set of stored exemplars of that category, and dissimilar enough to the set of stored exemplars of other categories.
Specific empirical support for similarity-based exemplar models comes from a range of experiments, primarily involving learning and classification in artificial categories (such as patterns of dots, random shapes and so on; see, for example, Estes, 1994; Nosofsky, 1991; Medin & Schaffer, 1978). However, one difficulty for these models arises from the fact that people sometimes attend to some dimensions more than others, giving greater weight to attribute-values on those dimensions in their classification of instances. To account for this, similarity-based exemplar models often allow different dimensions to have different weights in their contribution to similarity. Typically, dimensions which are diagnostic for the categories being learned are given greater weight than less diagnostic dimensions (see, for example, Nosofsky, 1986; Nosofsky, Gluck, Palmeri, McKinley, & Glauthier, 1994).

The similarity-based exemplar approach, then, gives a good account of typicality judgements in simple categories. However, this approach does not extend well to the case of combined categories. In the next subsection I describe some empirical results in typicality judgements for combined categories, and discuss two models which attempt to account for those results.

Combined categories: "Empty" combinations and overextension

Current empirical research on category combination can be seen as having two distinct strands, which address different aspects of people's typicality judgements for combined categories. One strand of research investigates people's judgement of membership typicality in intersective combined categories,1 and the relationship between those judgements and membership typicality in the constituent categories of the combinations.
This research has found that people reliably "overextend" their typicality judgements in certain combined categories, classifying an instance which is a bad member of both constituents of a combined category as a good member of the combination (Hampton, 1987, 1988, 1991; Chater et al., 1990; Storms, De Boeck, Van Mechelen & Ruts, 1996). To re-use a common example, people rate goldfish or guppies as typical members of the category "pet fish", but as untypical members of both constituent categories "pet" and "fish" (Osherson & Smith, 1981, 1982; Smith & Osherson, 1984). More generally, combined categories often show "compensation", where an instance which is not a member of one constituent of a combination is included as a member of the combined category because it is a highly typical member of the other constituent (Hampton, 1987, 1988, 1991).

This research on the overextension of combined categories focuses on intersective combinations, which have "a partial overlap of exemplars", in Hampton's words (Hampton, 1988). A second strand of research investigates how people interpret novel combined categories, often generated by pairing categories at random (see, for example, Wisniewski & Gentner, 1991; Grey & Smith, 1995). This research demonstrates that people have no difficulty understanding and making inferences about combined categories even when the intersection of those categories is empty, and the categories have no exemplars in common. For example, it is easy to understand the novel combination "pet lobster", even though the categories "pet" and "lobster" do not intersect. While there have been no studies empirically investigating people's typicality judgements for "empty" combinations, it seems to be generally accepted that such typicality judgements exist (see, for example, Osherson & Smith's, 1982, "striped apple" example; and see Rips, 1995, for a more general discussion).
Theoretical models of typicality in combined categories have taken two general approaches. One approach sees category combination as in some ways similar to set intersection. A recent model of this type is Huttenlocher & Hedges's (1994) "joint frequency distribution" model. In this model, categories are represented by sets of exemplars which are distributed by frequency in a multidimensional space. Each category has a category boundary drawn in that space, set to encompass some percentage of its exemplars (e.g. 95%). Categories are centred in regions of high exemplar frequency and boundaries drawn in regions of low frequency; the closer an exemplar is to the region of high frequency for a category, the more typical that exemplar is of the category. Category combination is modelled by the formation of a joint frequency distribution for exemplars across the two combining categories, again with a category boundary set to encompass some percentage of exemplars (e.g. 95%). Exemplars which have a high degree of frequency for both constituents of a combination will have a high degree of frequency, and hence a high degree of typicality, for their combination.

The "joint frequency distribution" model gives a good account of overextension and compensation in intersective combinations. Because the category boundaries for simple and combined categories in this model are drawn relative to the frequency distribution of exemplars in those categories, exemplars which were excluded by the category boundary for the simple constituents of a combination can be included by the category boundary for their combination. However, the model is unable to account for "empty" combinations such as "pet lobster": since the constituent categories in empty combinations have no exemplars in common, they have no joint frequency distribution.
A second approach to category combination sees the typicality of an instance in a combined category as a simple function of that instance's typicality in the constituents of the combination (Zadeh, 1982). For an instance i and categories A and B, instance i's typicality in the combination AB, notated as $C_{AB}(i)$, is a function f of i's typicality in the constituent categories A and B on their own, as in equation 1:

$$C_{AB}(i) = f\big(C_A(i),\, C_B(i)\big) \qquad (1)$$

where f is some function such as average, product, or min. This "simple function" approach can explain classification in "empty" combinations: an instance's degree of membership in the combination "pet lobster" would be a function of its membership in "pet" and in "lobster" on their own. If the instance had a good degree of membership in "pet" and a good degree of membership in "lobster", it would have a good degree of membership in the combination "pet lobster".

While the simple function approach can give an account of "empty" combinations, it is unable to explain the overextension of combined categories such as "pet fish". Since the simple function approach assumes that an instance's membership in a combined category will be low if the instance is a poor member of the constituent categories of the combination, it predicts that instances such as "goldfish" or "guppy", which are atypical members of the categories "pet" and "fish", would also be atypical members of the combination "pet fish". Further, as Osherson & Smith (1981, 1982) argue, this problem cannot be solved by selecting a function other than average, product, or min: there is in principle no simple function which can correctly produce an instance's membership typicality in a combined category from its typicality in the constituents.
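The behaviour of the "simple function" approach can be illustrated with a small sketch. The typicality ratings below are hypothetical numbers chosen only for illustration (they are not data from any study), and the function and variable names are illustrative assumptions. The sketch covers only the three candidate functions named in the text: with each of them, an instance that is a poor member of both constituents cannot come out as a good member of the combination.

```python
# Sketch of the "simple function" approach (equation 1): typicality in the
# combination AB is some function f of typicality in A and typicality in B.
# All numeric ratings are HYPOTHETICAL and serve only as an illustration.

CANDIDATE_FUNCTIONS = {
    "average": lambda a, b: (a + b) / 2,
    "product": lambda a, b: a * b,
    "min": min,
}


def combined_typicality(typicality_a, typicality_b, f):
    """Return C_AB(i) = f(C_A(i), C_B(i)) for typicalities in [0, 1]."""
    return f(typicality_a, typicality_b)


# Hypothetical ratings: a goldfish as a poor "pet" and a poor "fish".
goldfish_as_pet = 0.3
goldfish_as_fish = 0.4

results = {
    name: combined_typicality(goldfish_as_pet, goldfish_as_fish, f)
    for name, f in CANDIDATE_FUNCTIONS.items()
}

for name, value in results.items():
    # For ratings in [0, 1], none of these three functions can exceed the
    # better of the two constituent ratings, so none yields overextension.
    print(f"{name}: {value:.2f}")
```

For ratings in the range 0 to 1, the average never exceeds the larger rating, and the product and min never exceed the smaller one, which is why these functions cannot rate goldfish higher in "pet fish" than in "pet" or "fish".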
Current models, then, adopt a "divide and conquer" approach to typicality judgements in simple and combined categories, with each model accounting for some observed regularities in typicality, but no model accounting for all. In the next section I describe a new approach which attempts to provide a unified account for typicality judgements in simple and combined categories.

Classification in simple and combined categories: the "diagnostic and plausible evidence" model

Like other exemplar models, the model of classification described here assumes that people represent categories by storing sets of individual exemplars in memory, and that classification takes place relative to those stored exemplars. Unlike other exemplar models, this new model is not based primarily on similarity. In the new model the classification of an instance is based on the combination of diagnostic and plausible evidence provided by the attributes of that instance. This "diagnostic and plausible evidence" model provides a unified explanation for the family resemblance structure of simple categories, for overextension in combined categories, and for typicality judgements in "empty" combinations.

The model is presented in four parts, describing the computation of attribute diagnosticity, the combination of evidence from multiple attributes, classification in simple and combined categories, and the computation of the plausibility of evidence. To illustrate the presentation I will use an example set of stored exemplars of categories such as pet, fish, dog and lobster, described in terms of attribute-value pairs on five dimensions: LIVES, FOUND, KEPT-IN, COLOUR, and TEXTURE (see Table 1). These exemplars have category tags, indicating the categories of which they are members. Exemplar 3, for example, represents an instance of the category "goldfish", which is also a member of the categories "fish" and "pet".
_______________________________________________
Table 1 about here.
_______________________________________________

Attribute Diagnosticity: attributes as evidence for category membership

The idea of diagnostic attributes for a category is a familiar one from theories of concepts and categorisation (see, for example, Tversky, 1977). An attribute is diagnostic for a category if it is a successful guide to membership of that category. Diagnostic attributes give evidence for category membership: a new instance possessing a highly diagnostic attribute of a given category is likely to be a member of the category in question. The diagnosticity of an attribute for a category rises with the frequency with which that attribute occurs in exemplars of that category, and falls with the frequency with which that attribute occurs in exemplars of other categories. Diagnosticity is defined as in equation 2. Let P be an attribute, C be a category (a set of stored exemplars) and K be category C's contrast set (the set of stored exemplars which are not members of category C). If U is the universe of all stored exemplars of all categories, then K = U − C. The diagnosticity of P for category C relative to contrast set K is then equal to the number of exemplars in category C which possess attribute P, divided by the total number of exemplars in C plus the number of exemplars in the contrast set K which possess attribute P:

$$D(P \mid C \mid K) = \frac{\sum_{j \in C} \left(1 - |j_p - P|\right)}{|C| + \sum_{k \in K} \left(1 - |k_p - P|\right)} \qquad (2)$$

where $j_p$ is exemplar j's value on the dimension of attribute P (if $|j_p - P| = 0$ then exemplar j's value on that dimension and attribute P's value are the same), and $k_p$ is likewise exemplar k's value on that dimension. If an attribute P occurs in all exemplars in category C, but no exemplars in C's contrast set, then P is fully diagnostic for C (D(P|C|K) = 1).
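The definition of diagnosticity just given can be sketched in a few lines of code. This is a minimal sketch, assuming binary matching (an exemplar either has the attribute value or it does not, so each term in the sums is 1 or 0). The exemplars are hypothetical toy data invented for illustration and are not the contents of Table 1; the names `TOY_UNIVERSE`, `has_attribute` and `diagnosticity` are likewise illustrative.

```python
# Sketch of attribute diagnosticity (equation 2) under binary matching:
#   D(P|C|K) = (#members of C with P) / (|C| + #members of K with P)
# where K is every stored exemplar that is not in C.
# The exemplars below are HYPOTHETICAL toy data, NOT the paper's Table 1.

TOY_UNIVERSE = [
    {"name": "trout", "categories": {"fish"},
     "attributes": {"LIVES_IN": "water", "COLOUR": "brown", "TEXTURE": "scaly"}},
    {"name": "salmon", "categories": {"fish"},
     "attributes": {"LIVES_IN": "water", "COLOUR": "silver", "TEXTURE": "scaly"}},
    {"name": "lobster", "categories": {"lobster"},
     "attributes": {"LIVES_IN": "water", "COLOUR": "red", "TEXTURE": "shell"}},
    {"name": "terrier", "categories": {"dog", "pet"},
     "attributes": {"LIVES_IN": "land", "COLOUR": "brown", "TEXTURE": "furry"}},
]


def has_attribute(exemplar, attribute):
    """True if the exemplar's value on the attribute's dimension matches."""
    dimension, value = attribute
    return exemplar["attributes"].get(dimension) == value


def diagnosticity(attribute, category, universe):
    """Diagnosticity of `attribute` for `category`, contrast set K = U - C."""
    members = [e for e in universe if category in e["categories"]]
    contrast = [e for e in universe if category not in e["categories"]]
    if not members:
        return 0.0  # no stored exemplars of the category: no evidence
    hits_in_category = sum(has_attribute(e, attribute) for e in members)
    hits_in_contrast = sum(has_attribute(e, attribute) for e in contrast)
    return hits_in_category / (len(members) + hits_in_contrast)


# Shared with one non-fish exemplar, so only partly diagnostic: 2 / (2 + 1)
print(diagnosticity(("LIVES_IN", "water"), "fish", TOY_UNIVERSE))
# In every toy fish and in no other exemplar, so fully diagnostic: 2 / (2 + 0)
print(diagnosticity(("TEXTURE", "scaly"), "fish", TOY_UNIVERSE))
```

In this toy universe the scaly texture is fully diagnostic for "fish", while living in water is weakened by the one non-fish exemplar that shares it.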
Such an attribute is (on the basis of the set of exemplars seen so far) a perfect predictor of membership of C: a new instance possessing that attribute is most likely to be a member of C.

Computation of attribute diagnosticity can be demonstrated using the set of exemplars in Table 1. Consider the diagnosticity of the attribute < LIVES IN, WATER > for the category "fish". The attribute < LIVES IN, WATER > occurs in every exemplar in the category "fish" (exemplars 3, 4, 5, 6, 7, 8), and in 4 exemplars not in the category "fish" (exemplars 1, 2, 13, 15). The diagnosticity of < LIVES IN, WATER > for the category "fish" relative to the contrast set $K_{fish}$ ($K_{fish}$ = U − "fish") is thus

$$D(\langle \text{LIVES IN}, \text{WATER} \rangle \mid \text{``fish''} \mid K_{fish}) = \frac{6}{6 + 4} = 0.6 \qquad (3)$$

If instances were classified as members of the category "fish" solely on the basis of the attribute < LIVES IN, WATER >, there would be a 0.6 chance of classifying an instance correctly: out of ten instances with the attribute < LIVES IN, WATER >, six would be correctly classified instances of the category "fish", and four would be incorrectly classified instances of other categories.

Taking another example, consider the diagnosticity of the attribute < FOUND IN, SEA > for the category "fish". The attribute < FOUND IN, SEA > occurs in 3 out of 6 exemplars in the category "fish" (exemplars 6, 7, 8), and in only 1 exemplar not in the category "fish" (exemplar 1). The diagnosticity of < FOUND IN, SEA > for the category "fish" is as shown in equation 4:

$$D(\langle \text{FOUND IN}, \text{SEA} \rangle \mid \text{``fish''} \mid K_{fish}) = \frac{3}{6 + 1} = 0.43 \qquad (4)$$

If instances were classified as members of the category "fish" solely on the basis of the attribute < FOUND IN, SEA >, there would be a 0.43 chance of classifying an instance correctly. Out of six members of the category "fish", the three members which do not have the attribute < FOUND IN, SEA > would not be recognised as members of the category; out of four instances which do have the attribute < FOUND IN, SEA >, one instance would be incorrectly classified as a member of the category "fish".

Category Membership: combining evidence from multiple diagnostic attributes
A diagnostic attribute in an instance, then, gives evidence for that instance's classification in a category. Instances usually contain a number of different attributes, however, which may be more or less diagnostic for the category in question, or diagnostic for other categories. How is the evidence from these attributes combined to produce an overall likelihood of the instance's membership in a category? If an instance possesses one highly diagnostic attribute for a given category (an attribute which occurs in most stored exemplars of the category in question, and in very few exemplars of other categories), this attribute by itself gives very good evidence for classification; evidence from other attributes is more-or-less irrelevant. If an instance possesses a less diagnostic attribute for the category, that attribute gives only slight evidence for classification; evidence from other attributes is very important for classification. Generally, the degree to which other attributes of an instance contribute evidence for classification, relative to one particular attribute, is inversely proportional to the diagnosticity of that particular attribute. This is formalised in equation 5. For a new instance i possessing N attributes, E(i|C|K), the overall evidence for classifying the instance as a member of category C, is defined in terms of the subtractive multiplication of the diagnosticity of instance i's N attributes:

$$E(i \mid C \mid K) = 1 - \prod_{p=1}^{N} \left(1 - D(i_p \mid C \mid K)\right) \qquad (5)$$

where $i_p$ is instance i's value on dimension p, and K is the contrast set for category C (K = U − C, as before).

Equation 5 allows highly diagnostic attributes (which provide positive evidence for category membership) to increase the probability that an instance is a member of a category, but does not allow attributes of low diagnosticity (which provide no evidence for category membership) to decrease the probability of category membership.
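The combination rule of equation 5 (overall evidence is one minus the product of one minus each attribute's diagnosticity) can be sketched as follows. This is a minimal sketch: the function name `evidence` is an illustrative assumption, and the only diagnosticity values used are the 0.6 and 0.43 computed in equations 3 and 4, plus round numbers chosen to show the limiting cases.

```python
import math

# Sketch of equation 5: E = 1 - prod_p (1 - D_p), where D_p is the
# diagnosticity of the instance's p-th attribute for the category.


def evidence(diagnosticities):
    """Combine per-attribute diagnosticities into overall evidence."""
    return 1.0 - math.prod(1.0 - d for d in diagnosticities)


# Two attributes with the diagnosticities from equations 3 and 4:
# 1 - (1 - 0.6) * (1 - 0.43) = 1 - 0.228 = 0.772
two_attributes = evidence([0.6, 0.43])

# A non-diagnostic attribute (D = 0) contributes a factor of 1 and so
# leaves the evidence unchanged.
with_irrelevant = evidence([0.6, 0.43, 0.0])

# A fully diagnostic attribute (D = 1) forces the evidence to 1,
# whatever the other attributes are.
with_fully_diagnostic = evidence([1.0, 0.1, 0.0])

print(two_attributes, with_irrelevant, with_fully_diagnostic)
```

The two limiting cases shown here are the properties that make the rule asymmetric: strong evidence can only raise the overall value, and weak evidence never lowers it.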
If i contains attributes which are highly diagnostic of C (relative to contrast set K), then i is likely a member of C. If i contains an attribute P which occurs in all instances of category C and never occurs in instances of the contrast category K (i.e. P is fully diagnostic of C), then by equation 5 E(i|C|K) = 1, irrespective of the other attributes of the instance.2 While the presence of diagnostic attributes for a given category provides evidence for an instance's membership of that category, the presence of non-diagnostic attributes does not negate that classification. If instance i contains an attribute Q which is not diagnostic of C (i.e. D(Q|C|K) = 0), attribute Q will have no effect on the multiplication which makes up E(i|C|K), and classification will depend on the other attributes of the instance.

_______________________________________________
Table 2 about here.
_______________________________________________

Table 2 demonstrates the computation of combined evidence for membership typicality in the category "fish" using the illustrative universe of exemplars in Table 1. Consider the membership of exemplar 6, "salmon", in the category "fish". Exemplar 6 has five attributes, each of which has a different diagnosticity for the category "fish" (as described earlier, < LIVES IN, WATER > has a diagnosticity of 0.6 and < FOUND IN, SEA > has a diagnosticity of 0.43). In the subtractive multiplication of these diagnosticities, the two most diagnostic attributes (< … > and < … >) contribute most (they lower the product the most). The overall result shows that exemplar 6, "salmon", has a high degree of evidence (0.96) for membership in the category "fish".

_______________________________________________
Table 3 about here.
_______________________________________________

To take another example, consider the membership of exemplar 3, "goldfish", in the category "fish" (Table 3).
While exemplar 3, like exemplar 6, possesses the two most diagnostic attributes for the category "fish" (< … > and < … >), the other attributes of exemplar 3 are less diagnostic than the corresponding attributes of exemplar 6 (< … > is less diagnostic for the category "fish" than < … >). The overall degree of evidence for exemplar 3, "goldfish", as a member of category "fish" (0.9) is thus less than that for exemplar 6, "salmon".

In categories which have no single highly diagnostic attribute, but rather have a range of attributes of medium diagnosticity, equation 5 combines evidence equally from many different attributes in computing evidence for category membership. Equation 5 can thus explain the family resemblance structure of simple categories which are "fuzzy" and have no single highly diagnostic attribute. For example, Table 4 shows the varying degrees of evidence, and hence membership typicality, which exemplars 6 ("salmon"), 3 ("goldfish"), 8 ("shark") and 16 ("doberman") have for the category "fish". This variation is graded by family resemblance, with "salmon" and "shark" having the highest degree of evidence for membership (0.96 and 0.94 respectively), followed by "goldfish" (0.9), and with "doberman" having a very low degree of evidence (0.23). Equation 5 can also provide a natural account for people's selective attention to certain attributes in classification, assuming that the attributes attended to are highly diagnostic and thus make a bigger contribution to evidence for classification. Finally, direct evidence for the combination of diagnostic attributes in simple categorisation comes from Rosch & Mervis (1975), who found that people's judgement of typicality of instances in simple categories was graded by the number of diagnostic attributes of those categories which the instances possessed.

_______________________________________________
Table 4 about here.
_______________________________________________

Combined categories: different attributes as evidence for different categories

In classifying an instance in a simple category, then, diagnostic attributes give good evidence for membership in the category, and other less diagnostic attributes do not contribute as much evidence for membership. This "diagnostic evidence" approach generalises to the case of combined categories with the observation that an instance can be a member of a combined category if it has attributes giving diagnostic evidence for membership in each category in the combination: some attributes giving evidence for membership in one category, others for membership in others. This approach to membership in combined categories is formalised in equation 6, which provides a measure of the evidence for classifying an instance i as a member of a set of N combined categories $C_1 \dots C_N$. An instance will be a member of a combined category $C_1 \dots C_N$ if it possesses some attributes diagnostic of $C_1$, some other attributes diagnostic of $C_2$, and so on. Let the contrast category $K_{1..N}$ be the set of all instances not in any of the categories $C_1 \dots C_N$ ($K_{1..N} = U - (C_1 \cup \dots \cup C_N)$). Then the evidence for an instance i being classified as a member of the combined category $C_1 \dots C_N$ is given by

$$E(i \mid C_1 \dots C_N \mid K_{1..N}) = \prod_{n=1}^{N} E(i \mid C_n \mid K_{1..N}) \qquad (6)$$

An instance i is thus likely to be classified as a member of the combined category $C_1 \dots C_N$ if instance i contains some attributes which are diagnostic for each category $C_n$ relative to the contrast set $K_{1..N}$. If an instance i does not contain diagnostic attributes for some category $C_x$ (i.e. if $E(i \mid C_x \mid K_{1..N}) = 0$) then the instance i will not be classified as a member of the combined category $C_1 \dots C_N$ (since the evidence for membership of the combined category is the product of the evidence for membership of the constituents).

_______________________________________________
Table 5 about here.
_______________________________________________

This account provides a good explanation for the classification of instances in "empty" combinations, such as "pet lobster". An instance will be classified as a member of an empty combination if it possesses attributes which give evidence that it is a member of both constituents of the combination, even if the two categories being combined have no members in common. For example, Table 5 shows the computation of evidence for membership of an instance in the combination "pet lobster", relative to the universe of stored exemplars from Table 1. The instance shown has two attributes (< … > and < … >) which are fully diagnostic for the category "lobster" in that example universe of exemplars; the instance thus has maximum evidence for membership in that category. The instance also has a number of attributes with varying degrees of diagnosticity for the category "pet" (< … >, for example, has a diagnosticity for "pet" of 0.64), and so has a relatively high degree of evidence for membership in that category. Overall evidence for membership in the combination "pet lobster" is therefore also high (0.81).

This account also provides a good explanation for the overextension of membership typicality in combined categories such as "pet fish". Table 6 shows the classification of exemplar 3, "goldfish", in the combined category "pet fish" and in the simple categories "pet" and "fish" on their own. Notice that the degree of evidence for the "goldfish" exemplar's membership in the combined category is higher than the degree of evidence for membership in either simple category on its own. This disparity arises because the diagnosticity of attributes for a category in the context of a combined category is higher than the diagnosticity of those same attributes for that category by itself.
For example, in the context of the combination "pet fish", the diagnosticity of the attribute < FOUND IN, HOUSE > for the category "fish" is high (0.43). In the context of the simple category "fish" on its own, the diagnosticity of < FOUND IN, HOUSE > is lower (0.23).

_______________________________________________
Table 6 about here.
_______________________________________________

This difference in diagnosticities occurs because the contrast sets for combined categories are smaller than the contrast sets for simple categories. For a combination AB and its constituents A and B, the contrast set for AB ($K_{AB}$ = U − (A ∪ B); all exemplars neither in A nor B) is a subset of both the contrast set for constituent A ($K_A$ = U − A; all exemplars not in A) and that for constituent B ($K_B$ = U − B; all exemplars not in B). Since the diagnosticity of an attribute for a category falls with the number of occurrences of the attribute in the category's contrast set (equation 2), if the contrast set is small, the diagnosticity of the attribute is going to be high. For example, the contrast set for the simple category "fish", $K_{fish}$, in Table 1, consists of exemplars 1, 2, 9, 10, 11, 12, 13, 14, 15 and 16. The attribute < FOUND IN, HOUSE > occurs in seven exemplars in that contrast set, and in three exemplars in the category "fish". The diagnosticity of < FOUND IN, HOUSE > for "fish" is thus

$$D(\langle \text{FOUND IN}, \text{HOUSE} \rangle \mid \text{``fish''} \mid K_{fish}) = \frac{3}{6 + 7} = 0.23 \qquad (7)$$

The contrast set for the combined category "pet fish", $K_{petfish}$, consists only of exemplars 1, 2, and 16 (the only exemplars which are neither in "pet" nor "fish"). The attribute < FOUND IN, HOUSE > occurs in only one exemplar in that contrast set, and the diagnosticity of < FOUND IN, HOUSE > for "fish" is thus

$$D(\langle \text{FOUND IN}, \text{HOUSE} \rangle \mid \text{``fish''} \mid K_{petfish}) = \frac{3}{6 + 1} = 0.43 \qquad (8)$$

I discuss the "diagnostic and plausible evidence" model's account of overextension in more detail later.
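The effect of the contrast set on diagnosticity can be checked directly from the counts quoted in this section. In the sketch below the exemplar numbers and the counts (three "fish" exemplars with the attribute, seven occurrences in the contrast set for "fish", one occurrence in the contrast set for "pet fish") are taken from the text; the function and variable names are illustrative assumptions.

```python
# Sketch: how a smaller contrast set raises diagnosticity (equations 7, 8).
# Exemplar numbers and counts come from the text; names are illustrative.


def diagnosticity_from_counts(hits_in_category, category_size, hits_in_contrast):
    """Equation 2 expressed in counts: hits in C / (|C| + hits in K)."""
    return hits_in_category / (category_size + hits_in_contrast)


UNIVERSE = set(range(1, 17))   # exemplars 1..16
FISH = set(range(3, 9))        # exemplars 3..8 are tagged "fish"
K_FISH = UNIVERSE - FISH       # contrast set for the simple category
K_PET_FISH = {1, 2, 16}        # neither "pet" nor "fish", as stated

# The combined category's contrast set is contained in the simple one.
assert K_PET_FISH <= K_FISH

# < FOUND IN, HOUSE > occurs in 3 "fish" exemplars, in 7 exemplars of
# K_FISH, and in 1 exemplar of K_PET_FISH.
d_simple = diagnosticity_from_counts(3, len(FISH), 7)     # 3 / 13
d_combined = diagnosticity_from_counts(3, len(FISH), 1)   # 3 / 7

print(f"relative to K_fish:    {d_simple:.2f}")
print(f"relative to K_petfish: {d_combined:.2f}")
```

Only the denominator changes between the two calls, so the rise from 0.23 to 0.43 is due entirely to the six occurrences of the attribute that drop out of the contrast set when "pet" exemplars are no longer counted against "fish".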
Plausibility: the reliability of evidence for category membership

The previous sections have focused on the use of evidence for classification that is provided by diagnostic attributes. This section addresses the degree to which such diagnostic evidence for classification is reliable. The process of classification and categorisation is dynamic: sometimes new instances should not be placed in existing categories, but rather in newly-created categories all their own. More specifically, the degree to which an instance should be placed in a newly-created category, as opposed to some already-known category, is related to the degree to which that instance is familiar, or plausible. The more familiar or plausible an instance is, the more the categorisation evidence given by its attributes can be trusted. A highly novel or unfamiliar instance, even if it possesses completely diagnostic attributes of some category, cannot be assigned to that category as reliably as a familiar instance which possesses those diagnostic attributes; the fact that the instance is highly novel may indicate that it belongs in a new category rather than an already-existing one.

The idea of plausibility is related to that of instance familiarity or recognition. When presented with a new instance, people can easily recognise it as novel (different from anything they have seen before) or familiar (similar to things they have seen before). Recognising an instance as novel or familiar involves comparing the instance to all stored exemplars (e.g. Nosofsky, 1991). Given a universe of exemplars U, the familiarity of a new instance i can be defined as follows:

$$F(i \mid U) = 1 - \prod_{j \in U} \left(1 - Sim(i, j)\right) \qquad (9)$$

where Sim(i,j) is a measure of the similarity between instance i and instance j.
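Equation 9, read as one minus the product over all stored exemplars of one minus the similarity to each, can be sketched as below. This is a minimal sketch: the function name `familiarity` is an illustrative assumption, and the similarity values passed in are hypothetical numbers rather than values from the paper's tables.

```python
import math

# Sketch of equation 9: F(i|U) = 1 - prod_{j in U} (1 - Sim(i, j)).
# The input is the list of similarities between the new instance and
# each stored exemplar; the values used below are HYPOTHETICAL.


def familiarity(similarities):
    """Familiarity of an instance given its similarity to each exemplar."""
    return 1.0 - math.prod(1.0 - s for s in similarities)


# Identical to one stored exemplar: fully familiar.
identical_to_one = familiarity([0.2, 1.0, 0.0])

# Dissimilar to everything: completely novel.
novel = familiarity([0.0, 0.0, 0.0])

# Moderately similar to two exemplars: 1 - (1 - 0.5) ** 2 = 0.75.
moderately_familiar = familiarity([0.5, 0.5, 0.0])

print(identical_to_one, novel, moderately_familiar)
```

As with the evidence rule in equation 5, a dissimilar exemplar contributes a factor of 1 and therefore never lowers familiarity.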
As in equation 5, the format of equation 9 is such that comparisons in which the instances are highly similar provide positive evidence that instance i is familiar, but comparisons in which instances are dissimilar do not provide negative evidence for familiarity. If instance i is identical to any given instance k (i.e. Sim(i,k) = 1) then instance i will be fully familiar (F(i|U) = 1). If instance i has a high degree of similarity to a set X of stored instances (Sim(i,k) = $S_{high}$ when k is a member of X) then the familiarity of instance i rises with the number of instances in X and with the degree of similarity $S_{high}$ ($F(i \mid U) \approx 1 - (1 - S_{high})^{|X|}$); the larger the set of instances X, and the higher the degree of similarity $S_{high}$, the more familiar i will be. Only if i is fully dissimilar to all instances in the universe U (i.e. Sim(i,k) = 0 for all k in U) will the familiarity of instance i be zero.

Equation 9, of course, requires some way of computing the similarity of two instances. The "diagnostic and plausible evidence" model follows other exemplar models and computes similarity via the product rule: if instances i and j have N attributes $i_1 \ldots i_N$ and $j_1 \ldots j_N$ then Sim(i,j) is given as in equation 10:

$$\mathrm{Sim}(i, j) = \prod_{f=1}^{N} \bigl(1 - |i_f - j_f| + S_{min}\bigr) \tag{10}$$

where $S_{min}$ is a constant which measures the importance of any mismatched attribute in the similarity comparison (see Medin & Schaffer, 1978).

We can unite equation 5 and equation 9 to produce a general equation for membership and typicality in simple and combined categories. Given a universe of exemplars U, the membership typicality M of a new instance i for a (simple or combined) category $C_1 \ldots C_N$ is

$$M(i \mid C_1 \ldots C_N \mid U) = F(i \mid U) \cdot E\bigl(i \mid C_1 \ldots C_N \mid U \setminus (C_1 \cup \ldots \cup C_N)\bigr) \tag{11}$$

This equation is the essential statement of the "diagnostic and plausible evidence" model of simple and combined categorisation.
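The familiarity and typicality computations in equations 9 and 11 can be sketched as follows. This is a minimal illustration under stated assumptions, not the model's implementation: the similarity function uses the Medin & Schaffer (1978) multiplicative form, in which a matching attribute contributes 1 and a mismatching attribute contributes $S_{min}$, so that identical instances have similarity 1; whether this matches equation 10 term for term is an assumption. The exemplars and attribute values are hypothetical, and the evidence term E of equation 5 is passed in as a number because that equation is not reproduced in this section.

```python
from math import prod


def similarity(i, j, s_min):
    """Product-rule similarity (assumed Medin & Schaffer form):
    a matching attribute contributes 1, a mismatch contributes s_min."""
    return prod(1.0 if a == b else s_min for a, b in zip(i, j))


def familiarity(instance, universe, s_min):
    """Equation 9: F(i|U) = 1 - prod over j in U of (1 - Sim(i, j))."""
    return 1.0 - prod(1.0 - similarity(instance, j, s_min) for j in universe)


def membership_typicality(familiarity_value, evidence_value):
    """Equation 11: familiarity multiplied by diagnostic evidence."""
    return familiarity_value * evidence_value


# Hypothetical stored exemplars, each a tuple of attribute values.
universe = [("tank", "claws"), ("bowl", "fins"), ("sea", "claws")]

# An instance identical to a stored exemplar is fully familiar.
print(familiarity(("tank", "claws"), universe, s_min=0.5))  # 1.0

# A partial match to several exemplars is familiar to a lesser degree:
# similarities are 0.5, 0.5 and 0.25, so F = 1 - 0.5 * 0.5 * 0.75.
print(familiarity(("tank", "fins"), universe, s_min=0.5))  # 0.8125
```

With `s_min=0.5` (the value the paper uses in Appendix A) no instance is ever fully unfamiliar in this sketch, since every comparison yields a non-zero similarity; familiarity reaches zero only when every similarity is zero.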
Equation 11 describes the typicality of instances in simple (N=1) or combined (N>1) categories. It states that an instance i's membership typicality for a (simple or combined) category $C_1 \ldots C_N$ is proportional to i's familiarity relative to the set of known exemplars of all categories, and proportional to the degree of evidence which i's attributes provide that i is a member of each of the combining categories $C_1 \ldots C_N$.

_______________________________________________
Table 7 about here.
_______________________________________________

To illustrate the contribution of diagnostic evidence and plausibility to categorisation, consider the typicality of three different novel instances for the empty combination "pet lobster" (Table 7). Each of these instances has a certain degree of diagnostic evidence for membership in the combined category (computed as for the instance in Table 4). These instances also have varying degrees of plausibility, with exemplar A (which describes a lobster kept in a tank in a house) being more plausible than exemplar B (which describes a lobster kept in a bowl). Exemplar A is thus a more typical instance of the combination "pet lobster" than exemplar B, with exemplar C coming a poor third. Appendix A shows the computation of plausibility for the three exemplars A, B and C, relative to the set of stored exemplars. The similarity between each "pet lobster" instance and each stored exemplar is computed as in equation 10 (with $S_{min}$ = 0.5), and the plausibility of the instances is computed as in equation 9. It can be seen from this table that exemplar A has the highest degree of plausibility (0.95), essentially because lobsters are known to be kept in tanks but not in bowls.

Explaining two specific results in overextension: minimal overlap and category dominance.
In presenting the "diagnostic and plausible evidence" model, I've described in general terms how the model can account for the family resemblance structure of simple categories, for typicality judgements in "empty" combinations, and for the overextension of combined categories. In this section I'll discuss the model's account of overextension in more detail, showing how the model explains two specific findings: that the degree of overextension in combined categories is highest for those categories with minimal overlap (Hampton, 1988), and that the category with the higher degree of relative overlap in a combination often dominates typicality judgements for the combination (Storms et al., 1996).

Overextension and minimal overlap.

Hampton's (1988) study of overextension in combined categories found a relationship between the overextension of combinations and the degree of overlap between the categories being combined. The lower the degree of overlap between the categories being combined (the fewer exemplars the categories had in common), the more likely those combinations were to be overextended. This finding fits with the suggestion made by a number of researchers that overextension occurs when typicality judgements in combined categories are "recalibrated" to distribute judgements evenly across a scale: categories with a low degree of overlap require a higher degree of recalibration and hence are more likely to show overextension (Kamp & Partee, 1995; Huttenlocher & Hedges, 1994).

The "diagnostic and plausible evidence" model gives a natural explanation for this relationship between decreasing degree of overlap and rising overextension. As shown earlier, overextension arises in the model because of differences in contrast set size between simple and combined categories.
The contrast set for a simple category is the set of all instances not members of the category; the contrast set for a combined category is the set of all instances not members of any category in the combination. If the constituent categories in a combined category have a high degree of overlap, there will be little difference between the contrast set for that combined category and the contrast sets of its constituents, and hence the degree of overextension will be small. If the constituents of a combination have a minimal degree of overlap, there will be a large difference between the contrast set for that combined category and the contrast sets of its constituents, and hence the degree of overextension will be high.

This is described more formally in equation 12. Let A and B be two combining categories; let i be an instance which has a certain degree of membership in categories A and B. From equation 6, the degree of evidence for instance i's membership in the combination AB is given by $E(i \mid A \mid U \setminus (A \cup B)) \cdot E(i \mid B \mid U \setminus (A \cup B))$, and the degree of evidence for i's membership in categories A and B on their own is given by $E(i \mid A \mid U \setminus A)$ and $E(i \mid B \mid U \setminus B)$. The degree of overextension for categories A and B in the combination AB is equal to the difference between the degree of evidence for i's membership of the category by itself, and the degree of evidence for i's membership of the category when it occurs in the combination.
If A' and B' are the sets of instances which are members of categories A and B only and not members of their intersection ($A' = A \setminus (A \cap B)$; $B' = B \setminus (A \cap B)$), then $A \cup B = A \cup B'$, and $A \cup B = B \cup A'$, and the degrees of overextension for categories A and B are

$$\begin{aligned}
\text{overextension}(A, AB) &= E\bigl(i \mid A \mid U \setminus (A \cup B')\bigr) - E\bigl(i \mid A \mid U \setminus A\bigr) \\
\text{overextension}(B, AB) &= E\bigl(i \mid B \mid U \setminus (B \cup A')\bigr) - E\bigl(i \mid B \mid U \setminus B\bigr)
\end{aligned} \tag{12}$$

All else being equal, the only factor influencing the degree of overextension for a given category in a combination is the size of the contrast set for that category in the combination. For category A the degree of overextension rises as the size of B' rises; or, putting it another way, since $B' = B \setminus O$ (where $O = A \cap B$ is the overlap between the two categories), the degree of overextension for category A rises as the size of O (the degree of overlap) falls. For category B, similarly, the degree of overextension rises as the degree of overlap falls.

_______________________________________________
Table 8 about here.
_______________________________________________

This influence of overlap on overextension is illustrated in Table 8, which shows the membership typicality of the instance "labrador" in the combined category "pet dog" and the constituent categories "pet" and "dog" (computed as in equation 11). In Table 1, the categories "pet" and "dog" have a high degree of overlap (almost all instances of "dog" are also instances of "pet"). Table 8 shows that the typicality of "labrador" in the combination "pet dog" is not overextended, relative to typicality in the constituents "pet" and "dog": the degree of typicality of the instance in the combination (0.92), while greater than its typicality in the constituent "pet" (0.89), is less than its typicality in the constituent "dog" (0.99). This contrasts with the situation for the combination "pet fish", which has a small degree of overlap (only one instance of "fish" is also an instance of "pet").
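The contrast-set relations behind equation 12 can be made concrete with ordinary set operations. The sketch below uses the "pet fish" case: the universe of exemplars 1 to 16 and the "fish" membership 3 to 8 are inferred from the contrast sets listed for equations 7 and 8, while the exact membership of "pet" (in particular, which single "fish" exemplar is also a "pet") is a hypothetical choice made only so that the sets reproduce those listed contrast sets. The helper name `contrast_set` is illustrative.

```python
def contrast_set(universe, *categories):
    """All exemplars in the universe that belong to none of the categories."""
    return universe - set().union(*categories)


universe = set(range(1, 17))       # exemplars 1..16 (assumed from Table 1)
fish = set(range(3, 9))            # exemplars 3..8, implied by K_fish
pet = {8} | set(range(9, 16))      # hypothetical: exemplar 8 is the one pet fish

overlap = pet & fish               # O = A ∩ B
pet_only = pet - overlap           # B' for the category "fish"

k_fish = contrast_set(universe, fish)
k_pet_fish = contrast_set(universe, pet, fish)

# U \ (A ∪ B) equals U \ (A ∪ B'), as used in equation 12.
assert k_pet_fish == contrast_set(universe, fish, pet_only)

print(sorted(k_fish))      # [1, 2, 9, 10, 11, 12, 13, 14, 15, 16]
print(sorted(k_pet_fish))  # [1, 2, 16]

# The contrast set for "fish" shrinks by exactly |B'| inside the combination.
print(len(k_fish) - len(k_pet_fish) == len(pet_only))  # True
```

The smaller the overlap, the larger `pet_only` becomes and the more the contrast set shrinks when "fish" is placed in the combination, which is the source of overextension in the model.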
Table 6 shows that the typicality of the instance "goldfish" in the combination "pet fish" is highly overextended: the degree of typicality of the instance in the combination (0.93) is greater than its typicality in the constituent "pet" (0.9) and in the constituent "fish" (0.9).

Dominance and relative overlap.

A number of studies have shown a dominance effect in typicality judgements for combined categories, with one category in a combination being a significantly better predictor of instance typicality in the combination than the other category (Hampton, 1987, 1988; Chater, Lyon, & Myers, 1990). Storms et al. (1996) found that this dominance effect reliably occurred in a number of different tasks (membership ratings, exemplar generation and category naming). Importantly, the dominant constituent category in a combination was typically the one with the largest relative overlap with the other constituent category. In other words, the dominant category in a combination was the category with the fewest exemplars outside the intersection of the two combining categories.

Again, the "diagnostic and plausible evidence" model gives a natural account of these findings. In the model the membership typicality of an instance for a category is dependent on the diagnostic attributes that instance has for the category (equation 5). Across a range of instances, membership typicality in one category will be a good predictor of membership typicality in another category if the two categories share the same diagnostic attributes. The degree to which two categories share the same diagnostic attributes is dependent on the degree to which the categories share the same stored exemplars: if the two categories share many exemplars they will have many of the same diagnostic attributes; if they share few exemplars they may have many different diagnostic attributes.
In the case of category combination, membership typicality in a constituent category A will be a good predictor of membership in a combined category AB if the same attributes are diagnostic for the constituent category A and for the combination AB. The same attributes will be diagnostic for a constituent category A and a combination AB if A and AB share many exemplars; that is, if A has a large relative overlap with the other constituent of the combination, and has few exemplars which fall outside the intersection of the two combining categories.

This effect of dominance is illustrated in Appendix B, which shows the membership typicality of all stored exemplars in the categories "pet", "dog", and "pet dog" (computed according to equation 11). This table shows that membership typicality in "dog" is a better predictor of membership typicality in "pet dog" than membership typicality in "pet", since "pet" has a number of exemplars which fall outside the category "dog", but "dog" has few exemplars which fall outside the category "pet". The correlation between exemplar typicality in the constituent category "dog" and typicality in the combination "pet dog" is higher (0.98) than the correlation between typicality in the constituent category "pet" and typicality in the combination "pet dog" (0.9).

From typicality to inference: Access to "emergent" attributes.

Up to now the focus in this paper has been on the "diagnostic and plausible evidence" model as an account of people's typicality judgements in simple and combined categories. In this section I illustrate how the model can be extended to account for people's inferences in simple and combined categories.
In particular I'll show how the model can explain the interesting finding that people often infer "emergent" attributes for combined categories (attributes which are true of the combination but not of its constituent categories on their own) as fast or faster than "noun" attributes (attributes both true of the combination and true of its constituents; Springer & Murphy, 1992; Gagné & Murphy, 1996). In the model, speed of attribute inference for a category is taken to be proportional to attribute diagnosticity for that category. In the case of combined categories, "emergent" attributes (true of the combination but not its constituents on their own) are often more diagnostic for the combined category than "noun" attributes (true of both combination and constituents). Highly diagnostic emergent attributes are thus available for inference faster than less diagnostic "noun" attributes.

A number of researchers have investigated the speed with which people can access attributes for category combinations and for simple categories. Typically, speed of access is examined in an attribute verification task, in which participants are presented with a (simple or combined) category, and an attribute which is either true or false of that category. Participants are asked to verify whether the attribute is true or false of the presented category, and their speed of response and error rate are recorded. Springer & Murphy (1992) presented participants with adjective-noun combinations such as "hard cake" and asked them to verify attributes of those combinations. Some attributes were "noun" attributes, such as "hard cake is sweet", which applied to both the noun by itself and to the combination (both cakes and hard cakes are sweet). Others were "emergent" attributes, such as "hard cake is stale", which applied solely to the combination, but not to the noun on its own (cakes are generally not stale, but hard cakes are).
Under the assumption that the meanings for combined categories are constructed out of the meanings of their constituent parts, Springer & Murphy expected people to verify noun attributes faster than emergent attributes, because emergent attributes would only become available once the meaning of the combined category had been processed, while noun attributes would be directly available from the initial presentation of the noun. However, they found the contrary result: people reliably verified emergent attributes such as "hard cake is stale" at least as fast as, and often faster than, they verified noun attributes such as "hard cake is sweet".

While this result at first seems somewhat paradoxical, the "diagnostic and plausible evidence" model accounts for it very easily. To apply the model to category inference, as opposed to typicality judgements, I make use of Estes's idea of a "configural prototype" for a category (Estes, 1994). The configural prototype for a category is that instance, out of the set of all possible instances, which has the highest membership typicality for the category (the set of all possible instances for a category can be formed by combining all possible attributes in all possible ways). The attributes of the configural prototype for a category correspond to the attributes which would be inferred as typical of that category. The configural prototype for a category may correspond to some specific highly typical exemplar of the category which has been seen previously, or it may represent an average or summary across typical exemplars of the category. In the "diagnostic and plausible evidence" model the configural prototype for a (simple or combined) category is that instance, out of the set of all possible instances, which has the highest value in equation 11.

A number of researchers have proposed that access to attributes in a category is facilitated if those attributes are highly diagnostic of the category (e.g.
Barsalou, 1982). In the "diagnostic and plausible evidence" model, speed of access to configural prototype attributes is taken to be directly proportional to the diagnosticity of those attributes for the (simple or combined) categories in question. In the configural prototype for a simple category, attributes with high diagnosticity for the category are available faster than attributes with lower diagnosticity. In the configural prototype for a combined category, attributes with high average diagnosticity for the two combining categories are available faster than attributes with lower average diagnosticity. Emergent attributes for combined categories often have a higher degree of average diagnosticity than noun attributes, and are thus more rapidly available for inference.

_______________________________________________
Table 9 about here.
_______________________________________________

The formation of configural prototypes in the "diagnostic and plausible evidence" model is illustrated using the set of stored exemplars of the categories "cake", "fruit", and "hard" shown in Table 9. Some attributes in this table are noun attributes, true of all exemplars of the noun categories ("sweet" is a noun attribute of the category "cake": all exemplars of "cake" have the attribute "sweet"). Other attributes are emergent attributes, true of instances of the combined categories, but not true of the noun categories on their own ("stale" is an emergent attribute of the combination "hard cake": only instances of the combination have the attribute "stale").

The configural prototypes for the categories "cake", "fruit", "hard cake" and "hard fruit" are shown in Table 10 (these were formed by making all possible permutations of attributes used in Table 9, and finding those with the highest membership typicalities for the categories).
First, it is clear that different attributes are inferred for the simple and combined categories: where the prototype for "cake" is fresh, the prototype for "hard cake" is stale; where the prototype for "fruit" is fresh and sweet, the prototype for "hard fruit" is unripe and bitter. Further, considering the average diagnosticity of attributes for combined category prototypes, it is clear that emergent attributes ("stale" in "hard cake"; "bitter" in "hard fruit") have a higher average degree of diagnosticity than noun attributes ("sweet" in "hard cake"; "round" in "hard fruit"). These emergent attributes are thus available faster than the less diagnostic noun attributes.

_______________________________________________
Table 10 about here.
_______________________________________________
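The search for a configural prototype described above (enumerate every combination of attribute values, then keep the one with the highest membership typicality) can be sketched as an exhaustive search. This is an illustration under stated assumptions rather than the procedure used to build Table 10: the attribute dimensions, the weights, and the scoring function `toy_typicality` are all hypothetical stand-ins for equation 11, chosen only to make the block runnable.

```python
from itertools import product
from math import prod


def configural_prototype(attribute_values, typicality):
    """Return the candidate instance with the highest typicality score.

    attribute_values -- dict mapping each attribute dimension to its
                        possible values
    typicality       -- function scoring a tuple of values (a stand-in
                        for the membership typicality of equation 11)
    """
    dimensions = list(attribute_values)
    candidates = product(*(attribute_values[d] for d in dimensions))
    best = max(candidates, key=typicality)
    return dict(zip(dimensions, best))


# Hypothetical attribute dimensions for a category such as "hard cake".
attribute_values = {
    "texture": ["hard", "soft"],
    "freshness": ["fresh", "stale"],
    "taste": ["sweet", "bitter"],
}

# Hypothetical per-value weights; NOT values from the paper's tables.
weights = {"hard": 0.9, "soft": 0.1, "fresh": 0.3,
           "stale": 0.7, "sweet": 0.8, "bitter": 0.2}


def toy_typicality(values):
    """Placeholder score: product of the hypothetical weights."""
    return prod(weights[v] for v in values)


prototype = configural_prototype(attribute_values, toy_typicality)
print(prototype)  # {'texture': 'hard', 'freshness': 'stale', 'taste': 'sweet'}
```

Exhaustive enumeration grows with the product of the number of values per dimension, so it is practical only for small attribute sets of the kind used in the paper's tables.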
Publication date: 1998